After completing this tutorial you should be able to:
formulate meaningful questions based on a given data set
run a linear regression analysis using the tidymodels framework
understand potential impacts of climate change on the phenology of species
Download the directory for this project here, unzip it, and move it to your bi328 directory. You can open the Rproj file for this module by double-clicking it (which will launch RStudio), by opening RStudio and using File > Open Project, or by clicking on the Rproject icon in the top right of your program window and selecting Open Project.
There should be a file named 23_Phenology.qmd in that project directory. Use that file to work through this tutorial - you will hand in your rendered (“knitted”) Quarto file as your homework assignment. So, first thing: in the YAML header, change the author to your name. You will use this Quarto document to record your answers. Remember to use comments to annotate your code; at minimum you should have one comment per code set,¹ though you may of course add as many comments as you need to be able to recall what you did. Similarly, take notes in the document as we discuss the discussion/reflection questions, but make sure that you go back and clean them up for “public consumption”.
1 You should do this whether you are adding code yourself or using code from our manual, even if it isn’t commented in the manual… especially when the code is already included for you, add comments to describe how the function works/what it does as we introduce it during the participatory coding session so you can refer back to it.
Before you get started we are going to need an additional set of packages that we will want to install first.
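The installation chunk itself is not shown here. As a minimal sketch, assuming the packages you still need are among those loaded in the next chunk, you could check which ones are missing before installing anything. The helper name missing_packages() is hypothetical and only used for illustration, and the install line is left commented out so nothing is installed by accident.

```r
# packages used in this chapter (taken from the library() calls in the next chunk)
pkgs <- c("knitr", "ggplot2", "ggthemes", "patchwork", "dplyr", "tidyr",
          "readr", "skimr", "janitor", "magrittr", "lubridate", "tidymodels")

# hypothetical helper: return the packages that are not installed yet
missing_packages <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

to_install <- missing_packages(pkgs)
to_install

# uncomment to install anything that is missing
# install.packages(to_install)
```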
Now, let’s make sure to read in all the packages we will need.
# load libraries ----

# reporting
library(knitr)

# visualization
library(ggplot2)
library(ggthemes)
library(patchwork)

# data wrangling
library(dplyr)
library(tidyr)
library(readr)
library(skimr)
library(janitor)
library(magrittr)
library(lubridate)

# modelling
library(tidymodels)

# set other options ----
options(scipen = 999)

knitr::opts_chunk$set(
  tidy = FALSE,
  message = FALSE,
  warning = FALSE)
23.1 Introduction to phenology
Consider this
Briefly define the terms phenology, phenophases, life history and life history traits and argue how you think climate change might impact these.
Did it!
[Your answer here]
We are going to explore a data set that contains information on the phenology of Minnesota species to determine to what extent seasonal events are tied to climate and to what extent they depend on other factors. To do this we are going to use a data set generated by the Minnesota Phenology Network.
Consider this
Go to the Minnesota Phenology Network’s website and explore the short introductory and history statements in the “Home” and “About” sections. Check out the “Meet the species” page that presents Minnesota’s seven superstar species. Get to know them in the “Read more” sections.
Pick two of the seven superstars and list the specific phenophases that are associated with each species.
Did it!
[Your answer here]
23.2 Explore the data set
We are going to use data from the Minnesota Phenology Network to explore how climate change might be impacting the phenology of species.
Use your exploratory analysis skills to get an idea of the dimensions of the data set, what variables are contained in the data set and what data types they are. What function can you use?
Write a brief description of the data set.
Did it!
[Your answer here]
Pointers
That’s right - skim() will give you all the information in one handy place!
skim(pheno)
Table 23.1: Data overview MN phenology network data set.

(a) Data summary

|  |  |
|:--|:--|
| Name | pheno |
| Number of rows | 54741 |
| Number of columns | 12 |
| Column type frequency: |  |
| character | 9 |
| numeric | 3 |
| Group variables | None |

Variable type: character

| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|:--|--:|--:|--:|--:|--:|--:|--:|
| day | 0 | 1.00 | 1 | 10 | 0 | 4329 | 1 |
| event | 1 | 1.00 | 4 | 36 | 0 | 120 | 0 |
| species_common_name | 136 | 1.00 | 3 | 48 | 0 | 1827 | 0 |
| genus | 594 | 0.99 | 3 | 15 | 0 | 748 | 0 |
| species | 1566 | 0.97 | 2 | 24 | 0 | 1089 | 0 |
| county | 83 | 1.00 | 2 | 17 | 0 | 43 | 0 |
| lifeform | 1 | 1.00 | 6 | 7 | 0 | 3 | 0 |
| group | 136 | 1.00 | 4 | 22 | 0 | 17 | 0 |
| invasive | 53339 | 0.03 | 3 | 3 | 0 | 1 | 0 |

Variable type: numeric

| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|:--|--:|--:|--:|--:|--:|--:|--:|--:|--:|:--|
| year | 0 | 1 | 1997.56 | 16.53 | 1941 | 1986 | 2004 | 2011 | 2017 | ▁▂▂▂▇ |
| dataset | 0 | 1 | 6.56 | 4.46 | 1 | 2 | 7 | 12 | 13 | ▇▃▅▁▇ |
| day_of_year | 84 | 1 | 173.95 | 58.82 | 2 | 130 | 164 | 211 | 366 | ▁▇▇▃▁ |
You do still have to write your description though …
One thing you may have noticed is that we have a column day that contains the date - however, currently it is formatted as a character. This could cause issues down the line when we want to plot things.
Dates are notoriously difficult to deal with.² A useful package for working with dates is lubridate.
2 Remember when we listed out the many, many, many ways we can write a date and how conventions might differ between fields and countries?
Let’s use functions from that package to create a column called date that has the data type date, and while we are at it, we can also learn how to create new columns with the month and day.
pheno <- pheno %>%
  mutate(date = mdy(day),      # converts character in format month day year to date
         month = month(date),  # extract month
         day = day(date))      # extract day

# check class
class(pheno$date)
[1] "Date"
Now let’s get an idea of what data is contained in the data set.
Give it a whirl
Describe how you can use the skimr output to determine how many unique entries (categories) we have for our categorical variables. Then use a function to print which life forms, groups of species, and counties are contained in the data set to the console/your report.
Did it!
[Your answer here]
Pointers
The dplyr verb that can help you out here is distinct() or you could apply unique() to the column itself, which is a vector.
# output unique entries as dataframe
pheno %>%
  distinct(lifeform)

# output unique entries as vector
unique(pheno$lifeform)
[1] "PLANTS" "ANIMALS" NA "ABIOTIC"
Give it a whirl
Create a table that shows the number of species in each group of species.
Did it!
[Your answer here]
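Pointers
One possible approach is to group the records and count the distinct species within each group. This is a minimal sketch on a small made-up data frame; the object names toy and species_per_group are placeholders, and the rows are invented for illustration rather than taken from pheno.

```r
library(dplyr)

# toy data standing in for pheno (values invented for illustration)
toy <- tibble(
  group = c("BIRDS", "BIRDS", "BIRDS", "BIRDS", "WOODY", "WOODY"),
  species_common_name = c("MALLARD", "MALLARD", "KILLDEER", "WOOD DUCK",
                          "LILAC", "APPLE")
)

# number of distinct species recorded in each group
species_per_group <- toy %>%
  group_by(group) %>%
  summarize(n_species = n_distinct(species_common_name)) %>%
  arrange(desc(n_species))

species_per_group
```

Using n_distinct() rather than n() matters here because most species appear in many rows (one per observation).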
Let’s focus on woody plants for now.
Give it a whirl
Create a new object called woody that contains only woody plants.
Did it!
[Your answer here]
Give it a whirl
Determine what events are recorded for species in this category. Create an overview table with events organized alphabetically.
Did it!
[Your answer here]
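Pointers
The two steps above (subsetting to woody plants, then listing the recorded events alphabetically) can be sketched with filter(), count() and arrange(). The data frame below is invented for illustration, and the sketch assumes that woody plants are identified by the value "WOODY" in the group column; check this against your own output of the unique groups.

```r
library(dplyr)

# toy data standing in for pheno (values invented for illustration)
toy <- tibble(
  group = c("WOODY", "WOODY", "WOODY", "BIRDS"),
  event = c("LEAF BUDBREAK", "FLOWERING", "FLOWERING", "ARRIVAL")
)

# keep only woody plants (assumes the group label is "WOODY")
woody_toy <- toy %>%
  filter(group == "WOODY")

# one row per event with the number of records, sorted alphabetically
events <- woody_toy %>%
  count(event) %>%
  arrange(event)

events
```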
That’s a lot of events. Let’s try to get an idea of when these different events occur throughout the year.
Give it a whirl
Create a figure that effectively summarizes when different events typically occur throughout the year.
Did it!
[Your answer here]
Solution
Since we are interested in distributions, you would want to look at histograms or box plots - this is one way your figure could look:
Figure 23.1: Distribution of life history events throughout the year for Minnesota woody plants.
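The code for this figure is not shown. A minimal sketch of one way to draw such a plot, assuming a box plot with day of year on the x axis and one box per event, is given here on simulated data; the object names toy and p_events and all values are made up for illustration and this is not necessarily how Figure 23.1 was produced.

```r
library(ggplot2)

# simulated day-of-year values for three events (invented for illustration)
set.seed(1)
toy <- data.frame(
  event = rep(c("FLOWERING", "LEAF BUDBREAK", "COLORED LEAVES"), each = 20),
  day_of_year = c(rnorm(20, 130, 10), rnorm(20, 120, 8), rnorm(20, 270, 12))
)

# one horizontal box per event shows when in the year it typically occurs
p_events <- ggplot(toy, aes(x = day_of_year, y = event)) +
  geom_boxplot() +
  labs(x = "day of year", y = "event")

p_events
```

Putting the events on the y axis keeps long event names readable, which helps when there are many events to compare.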
23.3 Formulate a specific question
Since we are interested in whether climate change is impacting phenology, let’s choose a specific event that we think is likely to be linked to changes in temperature and could be a good indicator of changing phenology.
Consider this
Pick three events that you think could be good indicator events to look at and argue why you think they would be interesting to explore.
Did it!
[Your answer here]
Let’s start with the flowering date.
Give it a whirl
Create a subset called flowering that contains only records of flowering dates for woody plants. How could you plot this data to determine whether the flowering date has changed over time?
Describe what you expect the pattern to look like if the flowering date is occurring earlier, later, or not changing over time.
Plot the data and then use your predicted patterns to assess whether the flowering date is changing over time.
Pointers
This is what you want your figure to look like:
Figure 23.2: Change in day of flowering for woody plants in Minnesota 1941 - 2018. Fill of individual points indicates the month of the flowering date; linear trendline is fitted in red.
Let’s consider how useful this visualization is for answering our question. We combined all the woody species from all of Minnesota in the same plot. We probably don’t expect all the woody plants to react to changes in the same way or for that response to occur at the same pace. We should also consider that there could be differences based on the geographic location.
In this case, it might be more effective to narrow our question and pull out a specific species. Let’s use the American Elm (Ulmus americana). This is a deciduous species with a geographic range throughout most of the eastern US and southeast Canada, i.e. Minnesota is at the northern limit of its range. Overall, this is a pretty hardy species that will grow to a considerable size and is frequently found in urban settings. Like all elm species in Minnesota it is susceptible to Dutch elm disease, which is caused by an invasive fungal pathogen.
Consider this
Create a new object called elm that contains only entries for the American Elm.
Did it!
[Your answer here]
Since the American Elm is so widespread, let’s also consider that we might want to eliminate geography as a confounding factor.
Give it a whirl
Determine whether we have data for more than one county. Use that information to determine whether you want (need) to narrow down your data set. Explain your choice.
Did it!
[Your answer here]
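Pointers
A quick way to see how the records are spread across counties is to summarize by county. This is a minimal sketch on invented rows; toy_elm and county_counts are placeholder names, and your real elm object will give different numbers.

```r
library(dplyr)

# toy data standing in for elm (values invented for illustration)
toy_elm <- tibble(
  county = c("RAMSEY", "RAMSEY", "RAMSEY", "HENNEPIN"),
  year   = c(1941, 1942, 1943, 1990)
)

# number of records and the years covered in each county
county_counts <- toy_elm %>%
  group_by(county) %>%
  summarize(n_records = n(),
            first_year = min(year),
            last_year = max(year)) %>%
  arrange(desc(n_records))

county_counts
```

Looking at the years covered as well as the number of records helps you judge whether a county has a long enough time series to be useful on its own.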
It appears we have found our question!
Consider this
State the specific question we are asking and give a brief description of the data you will need and how you will analyze it to answer that question.
Did it!
[Your answer here]
Solution
Just so we are all in agreement - our question is: Has climate change changed the phenology of American Elm in Ramsey County, Minnesota?
Do we already have all the data we need to answer that question?
23.4 Determine changes in the flowering date of American elm in Ramsey County, MN
Our first step will be determining whether or not there has been a change in the flowering date of the American Elm.
Consider this
Make a prediction of how you think the flowering date of the American Elm may have changed over the 50-year time span recorded in our data set. Describe what your figure should look like if you are indeed correct; be specific and argue why you are expecting this pattern.
Did it!
[Your answer here]
Give it a whirl
Filter your data so that it only contains entries for Ramsey County.
Plot your data to determine if your prediction was correct; add a simple linear trend line to your plot to help identify the overall pattern.
Make sure to give your visualization a title and add a legend.³
3 You can do this using the chunk options (fig-cap) or directly on the figure itself
Did it!
[Your answer here]
Pointers
This is what your figure should look like.
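# keep only records from Ramsey County
elm <- elm %>%
  filter(county == "RAMSEY")

# scatter plot of flowering day over time with a linear trend line
# note: theme_standard is not defined in this chunk; it is assumed to be a
# custom theme object created earlier in the course material
ggplot(elm, aes(x = year, y = day_of_year, fill = month)) +
  geom_point(shape = 21, size = 3) +
  geom_smooth(method = "lm", color = "red") +
  scale_fill_viridis_c() +
  labs(x = "year", y = "flowering date") +
  theme_standard +
  theme(legend.position = "bottom")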
Give it a whirl
Determine the rate of change (remember to include units!).
Did it!
[Your answer here]
Pointers
Hint: You will need to determine the equation of your linear trend line to do this.
Call:
lm(formula = day_of_year ~ year, data = elm)

Residuals:
     Min       1Q   Median       3Q      Max
-21.7698  -5.7744  -0.6021   8.5442  19.2974

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 258.27107  146.61188   1.762   0.0836 .
year         -0.07927    0.07439  -1.066   0.2912
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.81 on 56 degrees of freedom
Multiple R-squared:  0.01987,   Adjusted R-squared:  0.002371
F-statistic: 1.135 on 1 and 56 DF,  p-value: 0.2912
Don’t get too hung up on whether the result is significant or not. The rate is still the rate and you will want to include that in your results.
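The call that produced this summary is not shown; judging from the Call line, it was presumably summary() applied to lm(day_of_year ~ year, data = elm). The sketch below uses simulated data (all values invented, object names are placeholders) to show how you could pull the slope out of a fitted model and express it in more readable units.

```r
# simulated flowering days for 1941-1991 (invented for illustration)
set.seed(42)
toy <- data.frame(year = 1941:1991)
toy$day_of_year <- 120 - 0.1 * (toy$year - 1941) + rnorm(nrow(toy), sd = 8)

# fit the linear trend and print the full summary
fit <- lm(day_of_year ~ year, data = toy)
summary(fit)

# the slope is the rate of change in days per year
slope <- coef(fit)[["year"]]

# convert to days per decade and to the total shift over the observed period
slope_per_decade <- slope * 10
total_shift <- slope * diff(range(toy$year))

c(per_year = slope, per_decade = slope_per_decade, total = total_shift)
```

If you plug in the estimate printed above (-0.07927 days per year), the same arithmetic gives about 0.8 days earlier per decade, or roughly 4 days earlier over 50 years.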
For your interpretation/discussion identify at least two possible mechanisms that could be causing this pattern. You will likely notice that you have a low R² value and the linear regression is not significant. What does that mean in this context? How can you apply that to generating possible mechanisms for the pattern?
4 Remember, results are more than just figures! Practice being specific, i.e. consider how much the flowering date has shifted and at what rate
Did it!
[Your answer here]
23.5 Determine temperature change in Ramsey County
If we want to investigate whether there is a relationship between climate change and change in the flowering date, we need climate data for Ramsey County for the same time period.
Consider this
Describe what your climate data set should look like and argue why you want to include certain variables. Describe the pattern you would expect to see in your climate data over time to support your hypothesis of what is driving the pattern you uncovered in the phenology data set. Be specific.
Did it!
[Your answer here]
Pointers
Consider what the ideal temporal and spatial resolution would need to be to “match” your phenology data.
You ask and you shall receive:
Go to the DNR Minnesota Climate Trends website to access historical observations for specific Minnesota locations. Use the drop-down menus and check boxes to select a subset that meets the specifications you set out above.
Unfortunately, the download button doesn’t work. Instead, you should cut and paste the data into a text editor or Excel and save it as a tab-delimited file in your data folder. Name it MN_temp.txt.
Let’s read in your temperature data. Unless you added headers in the text file, the column names will be missing. Similar to the way we can use skip if there are additional lines, we have an argument we can use to specify column names if they are missing.
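In readr, the argument in question is col_names. A minimal sketch, assuming your file has two columns (a year and a temperature): the column names year and temperature and the values are assumptions for illustration, and a temporary file stands in for MN_temp.txt so the example runs on its own.

```r
library(readr)

# write a small headerless, tab-delimited file (values invented for illustration)
tmp <- tempfile(fileext = ".txt")
writeLines(c("1941\t28.5", "1942\t31.2", "1943\t25.9"), tmp)

# supply the column names yourself because the file has no header row
temp_toy <- read_tsv(tmp, col_names = c("year", "temperature"))

temp_toy
```

For your own data you would replace tmp with the path to your file (for example something like "data/MN_temp.txt", depending on where you saved it) and adjust the column names to match what you downloaded.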
Plot your data to determine if your prediction was correct; add a simple linear trend line to your plot to help identify the overall pattern.
Give your visualization a title and legend and describe your results. Be specific (i.e. don’t just state the overall trend, consider how much temperature has changed). For your interpretation/discussion consider how the data differs from your predictions and determine whether it supports your hypothesis of the mechanism.
Did it!
[Your answer here]
Pointers
Here’s what your figure could look like:
Figure 23.3: Change in mean March temperature [Fahrenheit] for 1941 - 1991 (blue) and linear trendline (red).
I used a line graph, which is one option for visualizing change over time, as we are doing with this time series. A line graph can be appropriate instead of a scatter plot when the x axis shows ordered categories (here, years); similarly, as we have discussed, a line plot is an appropriate visualization of a time series.
We can add both a trendline and a line connecting individual points. This can be helpful because the linear trendline shows us the overall trend of the data while the line connecting the individual points helps us visualize local change from point to point.
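Layering the two lines could look roughly like this. It is a sketch on simulated temperatures (all values invented; toy_temp and p_temp are placeholder names) and not necessarily the code behind Figure 23.3.

```r
library(ggplot2)

# simulated mean March temperatures for 1941-1991 (invented for illustration)
set.seed(3)
toy_temp <- data.frame(year = 1941:1991)
toy_temp$temperature <- 28 + 0.05 * (toy_temp$year - 1941) +
  rnorm(nrow(toy_temp), sd = 4)

p_temp <- ggplot(toy_temp, aes(x = year, y = temperature)) +
  geom_line(color = "blue") +                    # connects consecutive years
  geom_point(color = "blue") +                   # individual observations
  geom_smooth(method = "lm", formula = y ~ x,    # overall linear trend
              se = FALSE, color = "red") +
  labs(x = "year", y = "mean March temperature [Fahrenheit]")

p_temp
```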
For categorical data we can also use barplots:
Figure 23.4: Change in mean March temperature [Fahrenheit] for 1941 - 1991 (orange bars).
Why do you think it is better to use a scatter plot or line plot compared to the barplot?
Give it a whirl
For a good presentation of your results, e.g. through a presentation or poster, you would likely want to produce a figure where you have change in flowering date and change in mean temperature for March side by side. Practice making that type of figure here:
Did it!
[Your answer here]
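Pointers
The patchwork package, which is loaded at the top of this chapter, lets you combine ggplot objects with the + operator. A minimal sketch on simulated data follows; the object names and all values are invented for illustration.

```r
library(ggplot2)
library(patchwork)

# simulated flowering days and temperatures (invented for illustration)
set.seed(11)
toy <- data.frame(year = 1941:1991)
toy$day_of_year <- 120 - 0.1 * (toy$year - 1941) + rnorm(nrow(toy), sd = 8)
toy$temperature <- 28 + 0.05 * (toy$year - 1941) + rnorm(nrow(toy), sd = 4)

# panel 1: flowering date over time
p1 <- ggplot(toy, aes(x = year, y = day_of_year)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, color = "red") +
  labs(x = "year", y = "flowering date [day of year]")

# panel 2: temperature over time
p2 <- ggplot(toy, aes(x = year, y = temperature)) +
  geom_line(color = "blue") +
  geom_smooth(method = "lm", formula = y ~ x, color = "red") +
  labs(x = "year", y = "mean March temperature [Fahrenheit]")

# place the panels side by side and label them A and B
p_both <- p1 + p2 + plot_annotation(tag_levels = "A")

p_both
```

Using p1 / p2 instead of p1 + p2 would stack the panels vertically, which can make it easier to compare years when both panels share the same x axis.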
23.6 Analyze relationship of flowering date and temperature
Give it a whirl
Especially when we look at the two figures side by side, we see a trend toward earlier flowering dates for American elm and warmer March temperatures. Describe the analysis you want to perform to determine whether there is a significant relationship between the mean temperature in March and the flowering date for the American Elm in Ramsey Co. Include what your dependent and independent variables are.⁵
Create a plot to visualize the relationship but hold off on the analysis component until the next step.
5 You are practicing writing up your methods. A good methods section includes not just what you want to do but why you are doing it/what that analysis is meant to achieve. A good way to start for this specific example would be e.g. “To determine whether {variable} depends on {variable}, we will {analysis}”
Did it!
[Your answer here]
Pointers
To be able to plot temperature vs flowering date those two variables will need to be in the same data set. You can combine them using left_join().
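As a minimal sketch of such a join, assuming both data sets have a year column to match on (the data frames and their values are invented for illustration):

```r
library(dplyr)

# toy flowering and temperature data (values invented for illustration)
flower_toy <- tibble(year = c(1941, 1942, 1943),
                     day_of_year = c(118, 112, 121))
temp_toy <- tibble(year = c(1941, 1942, 1944),
                   temperature = c(27.1, 33.4, 30.2))

# keep every flowering record and add the temperature for the matching year
combined <- left_join(flower_toy, temp_toy, by = "year")

combined
```

Because left_join() keeps all rows of the first data frame, a year without a temperature record (1943 in this toy example) ends up with NA in the temperature column, so it is worth checking for missing values after joining.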
Here’s what your figure could look like:
Figure 23.5: Relationship of mean temperature in March and flowering date for American Elm in Ramsey Co., MN.
You’ve already learned how to fit linear regressions using the function lm(). As you can imagine, there is a different function for various models (and occasionally multiple functions for the same type of models) and each comes with a slightly different syntax. It also might be more helpful to have the output e.g. in a data.frame or tibble.
There is a group of packages that has been designed to make interacting with models more consistent and processing the output in a more user-friendly way. You can install and load them as a group by calling tidymodels. From the name you have probably guessed that these packages have been designed to play well with the tidyverse.
Here’s how you can run a linear regression using parsnip, which is the package designed to fit models using a tidy, unified interface. Essentially, it provides an interface to different models behind the scenes. This means that you can use a consistent syntax instead of having to figure it out for each model. Because you have already used lm() directly, you will find that you are already familiar with the formula syntax used here.
The first step will be loading the tidymodels packages and specifying the model we want to use, in this case that would be a linear regression model.
linear_reg() # specify model
Linear Regression Model Specification (regression)
Computational engine: lm
That’s not too exciting, because after we specify the type of model we need to tell R which engine (method) it should use to fit the model. For example, to use an ordinary least squares regression, we would set the engine to lm.⁶
6 You could use the documentation page for linear_reg() to list all the possible engines.
Linear Regression Model Specification (regression)
Computational engine: lm
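The chunk that produced this output is not shown. Presumably it adds set_engine() to the model specification, along these lines (the object name lm_spec is a placeholder):

```r
library(tidymodels)

lm_spec <- linear_reg() %>%   # specify the type of model
  set_engine("lm")            # specify the engine used to fit it

lm_spec
```

Since lm is already the default engine for linear_reg(), the printed specification looks the same as before; stating the engine explicitly mostly makes the code easier to read and to change later.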
It appears we are getting somewhere. The last thing we still need to do is use the function fit() to tell it which variables we want to use to train/fit the model. We use the same formula syntax you are familiar with from using lm(). We will also use the function tidy() to convert the output into a handy tibble/data.frame.
# A tibble: 2 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 148. 4.60 32.1 1.51e-34
2 temperature -1.57 0.155 -10.1 1.35e-13
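The chunk that produced this tibble is not shown either. A minimal sketch of the full specify, fit, tidy sequence on simulated data follows; the data frame toy, its values, and the object names are assumptions for illustration, so the numbers will not match the output above.

```r
library(tidymodels)

# simulated temperature and flowering data (invented for illustration)
set.seed(7)
toy <- tibble(temperature = runif(40, 15, 45))
toy$day_of_year <- 150 - 1.5 * toy$temperature + rnorm(40, sd = 5)

lm_fit <- linear_reg() %>%                       # specify model
  set_engine("lm") %>%                           # specify engine
  fit(day_of_year ~ temperature, data = toy)     # fit: dependent ~ independent

# convert the coefficient table into a tibble
lm_results <- tidy(lm_fit)

lm_results
```

With your own data you would replace toy with the combined flowering/temperature data set and keep the formula as dependent variable ~ independent variable.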
This doesn’t look too different from the output you are used to seeing but we will see down the line that there are additional benefits to using the tidymodels framework when it comes to doing more sophisticated analysis.
We’ve previously used regressions mainly because we were interested in the slope, so that we could calculate a rate of increase or decrease over time. Now we are interested in a relationship between two continuous variables and whether or not one of them (the independent variable, in our case temperature) has significant explanatory power for the dependent variable (in this case the flowering date). So let’s think about how we should interpret the results of our regression.
The (Intercept) term tells us that for temperature = 0 Fahrenheit, the expected flowering date would be day 148 of the year. The temperature term (the slope) tells us that, on average, for every one degree Fahrenheit increase in temperature, we would expect the flowering date to occur 1.57 days earlier.
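To make the coefficients concrete, you can plug the rounded estimates from the tibble above into the regression equation, flowering day = 148 - 1.57 × temperature. This is a small sketch; the helper name predict_day() is hypothetical and the temperatures are arbitrary example values.

```r
# rounded estimates taken from the regression output above
intercept <- 148
slope <- -1.57

# hypothetical helper: expected flowering day for a given mean March temperature
predict_day <- function(temperature) {
  intercept + slope * temperature
}

# expected flowering day at 0, 30 and 40 degrees Fahrenheit
predict_day(c(0, 30, 40))
# → 148.0 100.9 85.2
```

A March that is 10 degrees warmer therefore corresponds to a predicted flowering date 15.7 days earlier. The value at 0 degrees is simply where the fitted line crosses the axis, so treat it with caution if such temperatures do not occur in your data.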
Consider this
After first thinking about the purely mathematical explanation, we would want to think about what is biologically meaningful/sensible. Is there part of our interpretation above that biologically doesn’t entirely make sense? What does this tell you about extrapolating beyond the reach of your data?
Did it!
[Your answer here]
Consider this
Discuss your results, make a conclusion about the relationship between American elm flowering date and mean March temperature and then discuss that in the context of our original question.
Did it!
[Your answer here]
Consider this
Especially if you are writing a paper or report in the style of “Introduction/Methods/Results/Discussion”, a good discussion starts by reiterating the original (broad) question you asked with a short description of how you specifically investigated it, followed by your key results, and then interprets/discusses those.
You can essentially follow a fill-in-the-blank-formula, which as you become more comfortable communicating your research will sound a little less formulaic and a little bit more you; adapt the following for your answer:
In this study, I investigated [broad question asked/hypothesis tested]. To achieve this, I [specific data set + analysis used]. I found that [key results: in your case you would make a specific statement about the trend of earlier flowering dates, the trend of warmer March temperatures, and the relationship of temperature and flowering date].
And then you would follow this with a brief discussion of whether and how your results apply to your initial question, i.e. we are asking whether climate change will impact the phenology of plants - how does this specific example apply to that question?
Did it!
[Your answer here]
23.7 Now you!
We asked a pretty general question to start us off with and used a specific species and a specific phenophase to investigate. To gather further evidence or determine if there are other patterns to observe we would want to investigate the phenology of further organisms.
Good thing we have a very large data set to work with!
Consider this
Go back to the original data set with all the phenology records (object pheno). To find a good species to investigate we will want to make sure that we have sufficient data to make a meaningful statement about changes in the timing of phenophases.
Use your data wrangling skills to identify all species in the data set with at least 30 years of data for a specific phenophase and location (i.e. you want at least 30 entries over at least 30 years).
Did it!
[Your answer here]
Pointers
You need to generate a table that looks like this - bonus: organize your table by group of species, species, and years of data available.
| group | species_common_name | event | county | min_year | max_year | time_observed | n_observations |
|:--|:--|:--|:--|--:|--:|--:|--:|
| AMPHIBIANS & REPTILES | SPRING PEEPER | FIRST HEARD | ITASCA | 1984 | 2016 | 32 | 35 |
| AMPHIBIANS & REPTILES | WOOD FROG | FIRST HEARD | ITASCA | 1984 | 2016 | 32 | 40 |
| BIRDS | AMERICAN ROBIN | ARRIVAL | SHERBURNE | 1975 | 2011 | 36 | 33 |
| BIRDS | AMERICAN ROBIN | FIRST FLOCK OF MIGRATORS | ITASCA | 1984 | 2016 | 32 | 36 |
| BIRDS | AMERICAN ROBIN | LAST SEEN | ITASCA | 1984 | 2014 | 30 | 56 |
| BIRDS | BLUE WINGED TEAL | ARRIVAL | SHERBURNE | 1980 | 2013 | 33 | 34 |
| BIRDS | BUFFLEHEAD | ARRIVAL | SHERBURNE | 1982 | 2012 | 30 | 30 |
| BIRDS | CANADA GOOSE | ARRIVAL | SHERBURNE | 1975 | 2011 | 36 | 31 |
| BIRDS | COMMON GOLDENEYE | ARRIVAL | SHERBURNE | 1982 | 2012 | 30 | 30 |
| BIRDS | COMMON MERGANSER | ARRIVAL | SHERBURNE | 1979 | 2012 | 33 | 33 |
| BIRDS | DARK EYED JUNCO | FIRST SEEN | ITASCA | 1984 | 2016 | 32 | 30 |
| BIRDS | EASTERN BLUEBIRD | ARRIVAL | SHERBURNE | 1975 | 2013 | 38 | 30 |
| BIRDS | GREAT BLUE HERON | ARRIVAL | SHERBURNE | 1975 | 2013 | 38 | 33 |
| BIRDS | GREEN WINGED TEAL | ARRIVAL | SHERBURNE | 1979 | 2013 | 34 | 32 |
| BIRDS | HOODED MERGANSER | ARRIVAL | SHERBURNE | 1982 | 2013 | 31 | 31 |
| BIRDS | KILLDEER | ARRIVAL | SHERBURNE | 1975 | 2013 | 38 | 34 |
| BIRDS | LESSER SCAUP | ARRIVAL | SHERBURNE | 1982 | 2012 | 30 | 30 |
| BIRDS | MALLARD | ARRIVAL | SHERBURNE | 1975 | 2011 | 36 | 32 |
| BIRDS | NORTHERN HARRIER | ARRIVAL | SHERBURNE | 1979 | 2012 | 33 | 30 |
| BIRDS | NORTHERN PINTAIL | ARRIVAL | SHERBURNE | 1979 | 2013 | 34 | 33 |
| BIRDS | NORTHERN SHOVELER | ARRIVAL | SHERBURNE | 1979 | 2012 | 33 | 33 |
| BIRDS | PIED BILLED GREBE | ARRIVAL | SHERBURNE | 1982 | 2013 | 31 | 30 |
| BIRDS | RED WINGED BLACKBIRD | ARRIVAL | SHERBURNE | 1977 | 2013 | 36 | 36 |
| BIRDS | RING NECKED DUCK | ARRIVAL | SHERBURNE | 1981 | 2012 | 31 | 32 |
| BIRDS | RUFFED GROUSE | FIRST COURTSHIP/TERRITORIAL BEHAVIOR | ITASCA | 1985 | 2016 | 31 | 48 |
| BIRDS | SANDHILL CRANE | ARRIVAL | SHERBURNE | 1980 | 2013 | 33 | 33 |
| BIRDS | WOOD DUCK | ARRIVAL | SHERBURNE | 1979 | 2013 | 34 | 35 |
| FORB | BELLWORT | FLOWERING | HENNEPIN | 1957 | 1991 | 34 | 32 |
| FORB | BLOODROOT | FLOWERING | HENNEPIN | 1957 | 1991 | 34 | 32 |
| FORB | BLOODROOT | LAST FLOWER | HENNEPIN | 1957 | 1991 | 34 | 33 |
| FORB | BRIDAL WREATH | FLOWERING | RAMSEY | 1941 | 1991 | 50 | 48 |
| FORB | CANADA VIOLET | FLOWERING | HENNEPIN | 1959 | 1991 | 32 | 30 |
| FORB | CAROLINA PUCCOON | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 31 |
| FORB | COMMON MILKWEED | FLOWERING | ITASCA | 1984 | 2016 | 32 | 31 |
| FORB | COMMON SAINT JOHN’S WORT | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 40 |
| FORB | CROWFOOT | FLOWERING | HENNEPIN | 1961 | 1991 | 30 | 30 |
| FORB | CUT-LEAVED TOOTHWORT | FLOWERING | HENNEPIN | 1960 | 2001 | 41 | 38 |
| FORB | DANDELION | FLOWERING | ITASCA | 1985 | 2016 | 31 | 33 |
| FORB | FALSE RUE ANEMONE | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 30 |
| FORB | FALSE SOLOMON’S SEAL | FLOWERING | HENNEPIN | 1960 | 1990 | 30 | 30 |
| FORB | FIREWEED | FLOWERING | ITASCA | 1984 | 2016 | 32 | 31 |
| FORB | GOLDEN RAGWORT | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 37 |
| FORB | LARGE-FLOWERED TRILLIUM | FLOWERING | HENNEPIN | 1958 | 1991 | 33 | 32 |
| FORB | LARGE-FLOWERED TRILLIUM | LAST FLOWER | HENNEPIN | 1958 | 1991 | 33 | 32 |
| FORB | MARSH MARIGOLD | FLOWERING | HENNEPIN | 1959 | 1991 | 32 | 31 |
| FORB | MINNESOTA TROUT-LILY | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 30 |
| FORB | PURPLE TRILLIUM | FLOWERING | HENNEPIN | 1957 | 1991 | 34 | 33 |
| FORB | PURPLE TRILLIUM | LAST FLOWER | HENNEPIN | 1957 | 1991 | 34 | 30 |
| FORB | RUE ANEMONE | FLOWERING | HENNEPIN | 1957 | 1991 | 34 | 33 |
| FORB | RUE ANEMONE | LAST FLOWER | HENNEPIN | 1957 | 1991 | 34 | 30 |
| FORB | SHARP-LOBED HEPATICA | FLOWERING | HENNEPIN | 1957 | 1991 | 34 | 33 |
| FORB | SHARP-LOBED HEPATICA | LAST FLOWER | HENNEPIN | 1958 | 1991 | 33 | 32 |
| FORB | SHOOTING STAR | FLOWERING | HENNEPIN | 1959 | 1990 | 31 | 31 |
| FORB | SHOWY LADY’S SLIPPER | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 30 |
| FORB | SKUNK CABBAGE | FLOWERING | HENNEPIN | 1957 | 1992 | 35 | 34 |
| FORB | SKUNK CABBAGE | LAST FLOWER | HENNEPIN | 1959 | 1991 | 32 | 31 |
| FORB | SNOW TRILLIUM | FLOWERING | HENNEPIN | 1957 | 1991 | 34 | 34 |
| FORB | SNOW TRILLIUM | LAST FLOWER | HENNEPIN | 1959 | 1991 | 32 | 31 |
| FORB | SPREADING DOGBANE | FIRST FALL COLOR | ITASCA | 1985 | 2016 | 31 | 31 |
| FORB | STARFLOWER | FLOWERING | ITASCA | 1984 | 2016 | 32 | 36 |
| FORB | SWAMP BUTTERCUP | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 31 |
| FORB | SWAMP MILKWEED | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 30 |
| FORB | WHITE TROUT-LILY | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 30 |
| FORB | YELLOW TROUT-LILY | FLOWERING | HENNEPIN | 1960 | 1991 | 31 | 30 |
| MAMMALS | WHITE TAILED DEER | FIRST ANTLERS | ITASCA | 1984 | 2016 | 32 | 72 |
| WOODY | AMERICAN ELM | FLOWERING | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | AMERICAN ELM | LEAF BUDBREAK | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | AMERICAN TAMARACK | ALL LEAVES COLORED (GROUP) | ITASCA | 1984 | 2016 | 32 | 34 |
| WOODY | AMERICAN TAMARACK | LEAF BUDBREAK | ITASCA | 1984 | 2016 | 32 | 45 |
| WOODY | APPLE | FLOWERING | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | APPLE | LAST FLOWER | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | APPLE | LEAF BUDBREAK | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | BEAKED HAZELNUT | FIRST POLLEN VISIBLE | ITASCA | 1984 | 2016 | 32 | 39 |
| WOODY | BIG TOOTHED ASPEN | LEAF BUDBREAK | ITASCA | 1985 | 2016 | 31 | 36 |
| WOODY | BUR OAK | LEAF BUDBREAK | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | COMMON LILAC | FLOWERING | ITASCA | 1985 | 2016 | 31 | 31 |
| WOODY | LILAC | FLOWERING | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | LILAC | FULL FLOWERING | RAMSEY | 1941 | 1991 | 50 | 50 |
| WOODY | LILAC | LEAF BUDBREAK | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | PIN CHERRY [FIRE C, BIRD C] | FLOWERING | ITASCA | 1984 | 2015 | 31 | 30 |
| WOODY | PUSSY WILLOW | FLOWERING | ITASCA | 1984 | 2016 | 32 | 70 |
| WOODY | QUAKING ASPEN | LEAF BUDBREAK | RAMSEY | 1941 | 1991 | 50 | 50 |
| WOODY | RED ELDERBERRY | FLOWERING | RAMSEY | 1941 | 1991 | 50 | 50 |
| WOODY | RED ELDERBERRY | LEAF BUDBREAK | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | RED MAPLE | COLORED LEAVES | ANOKA | 1984 | 2017 | 33 | 66 |
| WOODY | SILVER MAPLE | FLOWERING | RAMSEY | 1941 | 1991 | 50 | 51 |
| WOODY | SPECKLED ALDER [HOARY A] | FLOWERING | ITASCA | 1985 | 2016 | 31 | 35 |
| WOODY | TAMARACK | COLORED NEEDLES | ANOKA | 1984 | 2017 | 33 | 39 |
| WOODY | TREMBLING ASPEN | LEAF BUDBREAK | ITASCA | 1984 | 2016 | 32 | 49 |
Table 23.2: Minnesota species with at least 30 years of data for a specific phenophase in the Minnesota Phenology database.
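The code that generated this table is not shown. One way to build a table with these columns is sketched here, assuming time_observed is the difference between the last and first year (which matches the rows above) and that both time_observed and n_observations must be at least 30. The toy data, species names, and object names are invented for illustration.

```r
library(dplyr)

# toy records: species A has a long series, species B a short one
toy <- tibble(
  group = "WOODY",
  species_common_name = rep(c("SPECIES A", "SPECIES B"), times = c(31, 10)),
  event = "FLOWERING",
  county = "RAMSEY",
  year = c(1961:1991, 1982:1991)
)

long_series <- toy %>%
  filter(!is.na(year)) %>%
  # one row per species, phenophase and location
  group_by(group, species_common_name, event, county) %>%
  summarize(min_year = min(year),
            max_year = max(year),
            time_observed = max_year - min_year,
            n_observations = n(),
            .groups = "drop") %>%
  # keep only combinations with enough data
  filter(time_observed >= 30, n_observations >= 30) %>%
  arrange(group, species_common_name, desc(time_observed))

long_series
```

In this toy example only SPECIES A is kept (31 records spanning 30 years). Applied to pheno, you would drop the tibble() step and start the pipeline from pheno instead.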
We have already looked at a tree - let’s see if we can broaden the scope of the organisms we investigate and choose examples from different groups of species. For efficiency, we’ll divvy up the work and have everyone choose a different species.
Consider this
Call dibs on the species you would like to investigate. Do a tiny-google and write a 3-5 sentence description of your species. List the phenological questions you could pose about the species you have chosen based on your data set.⁷
7 If you were writing a report/paper this would be part of your introduction/background section.
Did it!
[Your answer here]
We are going to practice putting together all the components of a data science-esque analysis.
Data Science Process (H. Wickham & G. Grolemund: R for Data Science)
Give it a whirl
Choose the phenophase you want to investigate and go through the process of “Transform-Visualize-Analyze/Model” + “Communicate”.
Transform: Create a subset of your data set that contains only the data for the species, phenophase, and location you have chosen.
Visualize: Plot the change in the timing of the phenophase you have chosen over time.
Model/Analyze: Calculate the rate of change using a linear regression.
Import/Tidy/Transform: Determine what the appropriate temperature data is to match your phenophase data,⁸ then download and import it.
Visualize: Plot the change in temperature over time.
Model/Analyze: Calculate the rate of change using a linear regression.
Transform: Combine the phenophase and temperature data.
Visualize: Plot the relationship of temperature & phenophase
Model/Analyze: Determine if the relationship of temperature/phenophase is significant.
8 Remember this will include matching the geographic location, the time of year the temperature should be from, and the time span (to match the years in your data set)
Describe what you are doing as you go, i.e. describe your methods.⁹
Create a final multi-panel figure of your three visualizations and share it with your classmates in our slack channel along with your results (be specific).
9 Pro Tip: Use the instructions as your starting point, for some of them you would want to add a detail or two specific to your analysis.
Did it!
[Your answer here]
Consider this
Collect the results from all the species/phenophases we have analyzed, this includes the American Elm, your species and those of your classmates and discuss the overall results.
restate the overall broad question/central hypothesis
summarize in 2-3 sentences what data set you used (include all the species/phenophase analyses we’ve done as a class) and how you analyzed it.
summarize the results - which (if any) phenophases now occur earlier/later/have not changed? Which (if any) phenophases are significantly correlated with temperature? How are you reaching these conclusions?
discuss your results - what mechanisms are consistent with the patterns you have observed? Make sure to connect your final conclusion(s) back to your initial question/hypothesis.
Did it!
[Your answer here]
Consider this
A final question we should consider is what the potential impacts of changing phenologies could be. Consider that species within a biological community might be differentially impacted by climate change and that some might be directly impacted by climate change resulting in an altered phenology, while others might be indirectly impacted because of an altered phenology for a species they closely interact with (this is called a population asynchrony or phenological mismatch).
Describe how you could set up an analysis to test whether one of the species we analyzed might experience a phenological mismatch. Your description should include what data you would need and how you would design your analysis.
Did it!
[Your answer here]
23.8 Acknowledgments
These activities are based on the EDDIE Phenology Trends and Climate Change in Minnesota module.¹⁰
10 Freeman, P. (2021). Phenology Trends and Climate Change in Minnesota (Project EDDIE).